fix: multi-project routing and recursive exclude patterns#75
fix: multi-project routing and recursive exclude patterns#75PatrickSys wants to merge 10 commits intomasterfrom
Conversation
Non-recursive patterns like 'coverage/**' only matched at the project root, allowing nested paths (worktrees/portal30-pr/coverage/) to leak generated artifacts into search results. Adds '**/' prefix to all indexer exclude patterns and aligns file-watcher and project-discovery ignore lists with the same set of directories.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d1197a9393
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (availableProjects.length === 1) { | ||
| const project = getOrCreateProject(availableProjects[0].rootPath); | ||
| await initProject(project.rootPath, watcherDebounceMs, { enableWatcher: true }); | ||
| setActiveProject(project.rootPath); |
There was a problem hiding this comment.
Auto-route when a single root contains discovered subprojects
This only auto-selects when listProjectDescriptors() has exactly one entry. After a roots refresh that leaves one configured monorepo root, discovery also adds its nested projects to that list, so availableProjects.length becomes > 1 even though there is still only one root. In a rootless session where the workspace arrives later, the next tool call will therefore fall through to selection_required instead of routing to the sole root until the caller passes project manually.
Useful? React with 👍 / 👎.
| const nearestBoundary = await findNearestProjectBoundary(absolutePath, containingRoot); | ||
| const resolvedProjectPath = | ||
| nearestBoundary?.rootPath ?? containingRoot ?? (stats.isDirectory() ? absolutePath : undefined); |
There was a problem hiding this comment.
Accept file selectors that have no detectable project markers
The new project selector advertises absolute file paths, but this fallback only succeeds when findNearestProjectBoundary() found a marker or the selector itself is a directory. If the caller passes a file from a standalone/exported tree with no .git, .codebase-context, or supported manifest above it, both nearestBoundary and containingRoot are undefined and we return unknown_project. That makes file-path routing unusable in rootless/no-roots sessions for exactly the ad-hoc codebases this change is meant to support.
Useful? React with 👍 / 👎.
| if (directoryNames.has('.git')) { | ||
| return { | ||
| candidate: { rootPath: directoryPath, evidence: 'repo_root' }, |
There was a problem hiding this comment.
Detect
.git files as repo roots during project discovery
Nested Git submodules and linked worktrees usually expose .git as a file, not a directory. classifyDirectory() only checks directoryNames.has('.git'), so discoverProjectsWithinRoot() and findNearestProjectBoundary() miss those projects entirely. In a workspace that contains a submodule without another manifest marker, selecting a file inside it will climb to the outer root and route tools/resources to the wrong project.
Useful? React with 👍 / 👎.
Greptile SummaryThis PR delivers two things: (1) workspace-aware multi-project routing in the MCP server so a single server instance can serve multiple repos/monorepos without one config entry per project, and (2) a fix for index pollution caused by flat (non-recursive) exclude globs that allowed nested Key changes:
Issues found:
Confidence Score: 3/5
|
| Filename | Overview |
|---|---|
| src/index.ts | Core MCP server heavily reworked: adds per-session active-project tracking, LRU watcher eviction (MAX_WATCHED_PROJECTS=5), relative-path and file-path project selectors, project-scoped resource URIs, and a workspace overview fallback. Two issues found: unguarded JSON.parse for error-code routing (can throw on empty string) and side effects inside buildProjectDescriptor. |
| src/utils/project-discovery.ts | New module for workspace/monorepo traversal using marker-based detection. Logic is solid but two issues: isWorkspacePackageJson inconsistently treats packages:[] as a workspace root (unlike the workspaces:[] case), and the walker follows symlinked directories without escaping the trusted root boundary. |
| src/core/indexer.ts | Exclude patterns upgraded from flat globs (coverage/, dist/) to recursive patterns (/coverage/, etc.) and extended with new directories (worktrees, .claude, .nx, .turbo, .next, .cache, .planning, target, vendor). Straightforward and well-tested. |
| src/core/file-watcher.ts | Ignore list aligned with the indexer: adds build, .nx, target, vendor, worktrees, .claude. No logic changes, purely additive. |
| src/tools/index.ts | withProjectDirectory replaced by withProjectSelector which injects both project (new primary selector) and project_directory (deprecated alias) into all tool schemas. Also unconditionally prepends a routing preamble to every tool description, which may affect tool-selection accuracy for AI clients. |
| src/resources/uri.ts | Adds buildProjectContextResourceUri and getProjectPathFromContextResourceUri for project-scoped codebase://context/project/ URIs. Round-trip encoding is tested and correct. |
| src/project-state.ts | Clean factory module (mentioned in PR description as new, though absent from the changed-files list). Exposes getOrCreateProject, getProject, getAllProjects, removeProject, makePaths, normalizeRootKey backed by a module-level Map. No issues found. |
| tests/multi-project-routing.test.ts | Comprehensive test rewrite covering rootless startup, invalid root filtering, lazy indexing, active-project stickiness, monorepo subdirectory selection, file-path resolution, selection_required flows, and project-scoped resource reads. Good coverage. |
| tests/indexer-exclude-patterns.test.ts | New integration test that creates a real temp directory tree with nested coverage/, worktrees/, .claude/, and dist/ directories and verifies only legitimate source files appear in the keyword index. Solid regression guard for the recursive-exclude fix. |
| tests/project-discovery.test.ts | New unit tests for discoverProjectsWithinRoot and findNearestProjectBoundary. Covers marker detection, ignored directories, nearest-boundary resolution, and trust-root escape prevention. Well-structured. |
Sequence Diagram
sequenceDiagram
participant Client
participant Server as MCP Server (index.ts)
participant Resolver as resolveProjectForTool
participant Discovery as project-discovery.ts
participant State as project-state.ts
Client->>Server: tools/call (search_codebase, args)
Server->>Resolver: resolveProjectForTool(args)
alt args.project or args.project_directory provided
Resolver->>Resolver: resolveExplicitProjectSelection()
Resolver->>Resolver: resolveProjectSelector(selector)
alt absolute path or file:// URI
Resolver->>Discovery: findNearestProjectBoundary(path)
Discovery-->>Resolver: DiscoveredProjectCandidate
Resolver->>State: getOrCreateProject(rootPath)
else relative subproject path
Resolver->>Resolver: match against listProjectDescriptors()
Resolver->>Resolver: join with each knownRoot
Resolver->>Discovery: resolveProjectFromAbsolutePath()
Discovery-->>Resolver: ProjectState
end
Resolver->>Server: setActiveProject(rootPath)
else no project selector
alt activeProjectKey is set
Resolver->>State: getOrCreateProject(activeProject)
else one project available
Resolver->>State: getOrCreateProject(singleProject)
Resolver->>Server: setActiveProject()
else multiple projects, no active
Resolver-->>Server: selection_required error
Server-->>Client: {status:"selection_required", availableProjects:[...]}
end
end
Server->>Server: initProject (ensureIndexed + watcher)
Server->>Server: dispatchTool(name, args, createToolContext(project))
Server->>Server: inject {project, index} into JSON response
Server-->>Client: {status:"success", results:[...], project:{...}}
Comments Outside Diff (4)
-
src/utils/project-discovery.ts, line 2063-2076 (link)Inconsistent empty-array check for
workspaces.packagesThe first branch correctly guards against an empty
workspacesarray (parsed.workspaces.length > 0), so apackage.jsonwith"workspaces": []is not treated as a workspace root.The second branch does not apply the same guard.
Array.isArray([])istrue, so apackage.jsonwith"workspaces": { "packages": [] }is incorrectly classified as a workspace root, which forcescontinueScanning: trueand causes deeper (potentially expensive) directory traversal where none is needed. -
src/index.ts, line 906-919 (link)Unexpected side-effect inside
buildProjectDescriptorbuildProjectDescriptorcallsrememberProjectPath()— which updatesprojectSourcesByKey— on every invocation. Functions namedbuild…conventionally return a value without mutating shared state. Here the side effect means that every place that callsbuildProjectDescriptorfor display or error-reporting purposes (e.g. insidebuildProjectSelectionPayloadandbuildProjectSelectionError) silently registers the path as a known project, which could subtly inflatelistProjectDescriptors()and affect subsequent routing decisions.Consider moving the
rememberProjectPathcall to the explicit registration sites (registerKnownRoot,registerDiscoveredProjectPath,resolveProjectFromAbsolutePath, etc.) and makingbuildProjectDescriptora pure getter. -
src/utils/project-discovery.ts, line 2141-2176 (link)Symlinked directories are followed without a depth-guard bypass
fs.readdirwith{ withFileTypes: true }returnsDirententries whoseisDirectory()method follows symlinks. A symlink that points to an ancestor of the current path creates a cycle. Thedepth >= maxDepthcap limits the damage (max 4 levels), but the walker will still descend into every symlinked directory up to that cap, potentially reading large subtrees or escaping the intended root boundary.Consider filtering out symlinked directories before traversal:
.filter((entry) => !entry.isSymbolicLink())
or using
fs.realpathto verify the resolved path stays withinresolvedTrustedRootPathbefore recursing. -
src/tools/index.ts, line 1920-1929 (link)All tool descriptions are unconditionally prefixed
withProjectSelectorprepends"Routes to the active/current project automatically when known. "to every tool's description regardless of whether the tool actually does project routing (e.g.get_indexing_statusorremember). MCP clients and AI agents read these descriptions for tool selection; adding a uniform preamble to all 10 tools makes them harder to differentiate and may degrade tool-selection accuracy.Consider only prepending for tools that perform index-consuming operations (
INDEX_CONSUMING_TOOL_NAMES), or making the description update opt-in per tool definition.
Last reviewed commit: "fix: make indexer ex..."
| ) | ||
| }; | ||
| } | ||
|
|
||
| async function resolveProjectForResource(): Promise<ProjectState | undefined> { | ||
| const availableRoots = getKnownRootPaths(); | ||
| if (availableRoots.length !== 1) { | ||
| const activeProject = activeProjectKey ? getTrackedRootPathByKey(activeProjectKey) : undefined; |
There was a problem hiding this comment.
Unguarded
JSON.parse when routing by error code
resolveProjectSelector reaches into a ToolResponse by parsing its JSON text in order to inspect errorCode. If resolution.response.content?.[0]?.text is an empty string (not null/undefined), the ?? '{}' nullish fallback won't fire — the nullish coalescing operator only substitutes on null/undefined — and JSON.parse('') will throw a SyntaxError that is not caught anywhere in the call chain, crashing the MCP request handler.
In addition, this approach couples routing logic to the serialised wire format: any future change to how buildProjectSelectionError serialises its payload would silently break project resolution here.
| ) | |
| }; | |
| } | |
| async function resolveProjectForResource(): Promise<ProjectState | undefined> { | |
| const availableRoots = getKnownRootPaths(); | |
| if (availableRoots.length !== 1) { | |
| const activeProject = activeProjectKey ? getTrackedRootPathByKey(activeProjectKey) : undefined; | |
| let errorCode: string | undefined; | |
| try { | |
| errorCode = ( | |
| JSON.parse(resolution.response.content?.[0]?.text || '{}') as { | |
| errorCode?: string; | |
| } | |
| ).errorCode; | |
| } catch { | |
| errorCode = undefined; | |
| } | |
| if (errorCode !== 'unknown_project') { | |
| return resolution; | |
| } |
Summary
project-state.ts,project-discovery.ts, and rewiressrc/index.tsto route by root.coverage/,dist/,worktrees/directories leaked into search results because exclude globs lacked**/prefix. Aligns indexer, file-watcher, and project-discovery ignore lists.Key changes
src/index.ts— per-root state management, MCP roots wiring, tool dispatch routingsrc/project-state.ts— new: factory + Map-backed per-project statesrc/utils/project-discovery.ts— new: workspace/monorepo discovery with marker-based detectionsrc/core/indexer.ts— recursive**/exclude patterns + new directories (worktrees, .claude, .nx, etc.)src/core/file-watcher.ts— aligned ignore list with indexerTest plan
npx vitest run— 312 tests pass, 0 failuresnpx eslinton modified files — cleansearch_codebase